Discover Relevant Sources: A Multi-Armed Bandit Approach

Authors

  • Onur Atan
  • Mihaela van der Schaar
Abstract

Existing work on online learning for decision making takes the available information as given and focuses solely on choosing the best actions given this information. Instead, in this paper, the decision maker must simultaneously learn both what decisions to make and what source(s) of information to consult/gather data from in order to inform its decisions such that its reward is maximized. We formalize this joint problem of learning and online decision making as a multi-armed bandit problem. If it were known in advance which sources were relevant for which decisions, the problem would be simple; however, this is not known in advance. We propose algorithms that discover the relevant source(s) over time, while simultaneously learning what actions to take based on the information revealed by the selected source(s). Our algorithm resembles the well-known UCB algorithm but adds the online discovery of which specific sources are relevant to consult to inform specific decisions. We prove logarithmic regret bounds and also provide a matching lower bound on the number of times a wrong source is selected, which RSUCB achieves in specific cases. The proposed algorithm can be applied in many settings, including clinical decision support systems for medical diagnosis, recommender systems, and actionable intelligence, where observing the complete information of a patient or a consumer, or consulting all available sources to gather intelligence, is not feasible.
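The abstract does not spell out how relevant-source discovery interleaves with action selection, so the following is a minimal illustrative sketch, not the paper's RSUCB: a UCB-style learner keeps optimistic value estimates over (source, observation, action) triples, consults the most optimistic source, observes its signal, and then acts. The class name SourceActionUCB, its keying scheme, and the toy reward model are all invented for this illustration; consult the full paper for the actual algorithm and regret analysis.

```python
# Minimal sketch (NOT the paper's RSUCB): a UCB-style learner that first
# chooses which information source to consult, then chooses an action
# conditioned on that source's observation. All names are invented here.
import math
import random


class SourceActionUCB:
    def __init__(self, n_sources: int, n_actions: int):
        self.n_sources = n_sources
        self.n_actions = n_actions
        self.counts = {}  # key -> number of pulls
        self.sums = {}    # key -> cumulative reward
        self.t = 0        # round counter

    def _ucb(self, key):
        n = self.counts.get(key, 0)
        if n == 0:
            return float("inf")  # untried: force exploration
        return self.sums[key] / n + math.sqrt(2.0 * math.log(self.t) / n)

    def select(self, observe):
        """Pick a source optimistically, observe it, then pick an action.

        `observe(s)` returns source s's current observation. Source scores
        marginalize over observations via the aggregate key (s, None, a).
        """
        self.t += 1
        source = max(
            range(self.n_sources),
            key=lambda s: max(self._ucb((s, None, a))
                              for a in range(self.n_actions)),
        )
        obs = observe(source)
        action = max(range(self.n_actions),
                     key=lambda a: self._ucb((source, obs, a)))
        return source, obs, action

    def update(self, source, obs, action, reward):
        # Update both the observation-conditioned and the marginal statistics.
        for key in ((source, obs, action), (source, None, action)):
            self.counts[key] = self.counts.get(key, 0) + 1
            self.sums[key] = self.sums.get(key, 0.0) + reward


# Toy run: source 0 reveals the latent state; source 1 is uninformative.
rng = random.Random(0)
bandit = SourceActionUCB(n_sources=2, n_actions=2)
for _ in range(2000):
    state = rng.randint(0, 1)
    src, obs, act = bandit.select(lambda s: state if s == 0 else -1)
    reward = 1.0 if (src == 0 and act == state) else 0.5 * rng.random()
    bandit.update(src, obs, act, reward)
```

The marginal (source, None, action) key is one simple way to score a source before its observation is seen; the paper's actual source-relevance statistics and confidence bounds will differ.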


Similar articles

Anytime many-armed bandits

This paper introduces the many-armed bandit problem (ManAB), where the number of arms is large relative to the relevant number of time steps. While the ManAB framework is relevant to many real-world applications, the state of the art does not offer anytime algorithms for handling ManAB problems. Both theory and practice suggest that two problem categories must be distinguished; the easy catego...


Trading Off Scientific Knowledge and User Learning with Multi-Armed Bandits

The rise of online educational software brings with it the ability to run experiments on users quickly and at low cost. However, education is a dual-objective domain: not only do we want to discover general educational principles, but we also want to teach students as much as possible. In this paper, we propose an automatic method for allocating experimental samples, based on multi-armed bandit alg...


Instrument-Armed Bandits

We extend the classic multi-armed bandit (MAB) model to the setting of noncompliance, where the arm pull is a mere instrument and the treatment applied may differ from it, which gives rise to the instrument-armed bandit (IAB) problem. The IAB setting is relevant whenever the experimental units are human since free will, ethics, and the law may prohibit unrestricted or forced application of trea...
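As a hedged illustration of the noncompliance structure described above (generic instrumental-variable notation, not necessarily the cited paper's): the pulled arm acts only as an instrument for the treatment actually received, which in turn drives the reward.

```latex
% Generic IV structure (assumed notation): Z_t is the pulled arm,
% D_t the treatment actually received, Y_t the observed reward,
% U_t unobserved confounding.
\[
  D_t = g(Z_t, U_t), \qquad Y_t = f(D_t, U_t)
\]
```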


Showing Relevant Ads via Context Multi-Armed Bandits

We study context multi-armed bandit problems where the context comes from a metric space and the payoff satisfies a Lipschitz condition with respect to the metric. Abstractly, a context multi-armed bandit problem models a situation where, in a sequence of independent trials, an online algorithm chooses an action based on a given context (side information) from a set of possible actions so as to...
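For concreteness, a Lipschitz payoff condition of the kind referred to above can be written as follows; the symbols mu_a (expected payoff of arm a), D (the metric on the context space), and L (the Lipschitz constant) are generic notation and may differ from the cited paper's exact formulation.

```latex
% Expected payoff of each arm a is Lipschitz in the context (generic notation):
\[
  \lvert \mu_a(x) - \mu_a(x') \rvert \le L \, D(x, x')
  \quad \text{for all contexts } x, x'.
\]
```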


A Multi-armed Bandit to Smartly Select a Training Set from Big Medical Data

With the availability of big medical image data, the selection of an adequate training set is becoming more important to address the heterogeneity of different datasets. Simply including all the data not only incurs high processing costs but can even harm the prediction. We formulate the smart and efficient selection of a training dataset from big medical image data as a multi-armed bandit ...




Journal title:

Volume   Issue

Pages  -

Publication date: 2015